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ABSTRACT 

In this demo, we present PackageBuilder, a system that ex¬ 
tends database systems to support package queries. A package is 
a collection of tuples that individually satisfy base constraints and 
collectively satisfy global constraints. The need for package support 
arises in a variety of scenarios: For example, in the creation of 
meal plans, users are not only interested in the nutritional content 
of individual meals (base constraints), but also care to specify daily 
consumption limits and control the balance of the entire plan (global 
constraints). We introduce PaQL, a declarative SQL-based package 
query language, and the interface abstractions which allow users to 
interactively specify package queries and easily navigate through 
their results. To efficiently evaluate queries, the system employs 
pruning and heuristics, as well as state-of-the-art constraint opti¬ 
mization solvers. We demonstrate PACKAGEBUILDER by allowing 
attendees to interact with the system’s interface, to define PaQL 
queries and to observe how query evaluation is performed. 

1. INTRODUCTION 

Traditional database queries define constraints (selection predi¬ 
cates) that each tuple in the result needs to satisfy. Although they 
are undoubtedly expressive and powerful, they prove inadequate in 
scenarios that require a set of answer tuples to satisfy constraints 
collectively. Such scenarios arise in a variety of applications: 

Meal planner: An athlete needs to put together a dietary plan in 
preparation for a race. She wants a high-protein set of three gluten- 
free meals for the day, having in total between 2,000 and 2,500 
calories. It is easy to exclude meals with gluten, as this condition can 
be enforced on each individual meal (tuple) with a regular selection 
predicate. The other constraints (e.g., total calories), however, need 
to be verified collectively over the entire package. 

Vacation planner: A couple wants to organize a relaxing vacation 
at a tropical destination. They do not want to spend more than 
$2,000 on flights and hotels combined. They also want to be in 
walking distance from the beach, unless their budget can fit a rental 
car, in which case they are willing to stay farther away. Building the 
ideal vacation package is challenging as the choice of hotels affects 
the choice of flights and car rentals. 

Investment portfolio: A broker wants to construct an investment 
portfolio for one of her clients. The client has a budget of $50K, 
wants to invest at least 30% of the assets in technology, and wants 
a balance of short-term and long-term options. The broker cannot 
select each stock option individually, but rather needs to find a stock 
package that satisfies all these constraints collectively. 

These examples cannot be expressed by traditional SQL queries. 
We demonstrate PACKAGEBUILDER, a system that augments 
database functionality to support the creation of packages. A 


package is a collection of tuples that individually satisfy base 
constraints and collectively satisfy global constraints. The base 
constraints are equivalent to regular selection predicates, and can 
be evaluated individually for each tuple. For example, in the meal 
planner application, the gluten-free restriction is a base constraint, 
as it can be verified independently on each meal. In contrast, the 
requirement that the total amount of calories should be between 
2,000 and 2,500 cannot be evaluated on each meal individually, 
but needs to be assessed over a collection of meals. Our system 
addresses three main challenges: 

Language specification: Even though many use cases motivate 
support for package queries, this class of queries remains largely 
unsupported, with few tools targeting domain-specihc packages 
(e.g., CourseRank supports building course packages J^). As part 
of this demonstration, we will present PaQL, a declarative query 
language that supports package specifications. PaQL is designed 
with simple extensions to standard SQL. Those familiar with SQL 
should find it intuitive and easy to use (Section|^. 

Interactive specification: Even traditional SQL queries can often 
be challenging to specify for novice users. To enable user-friendly 
database applications, several systems now employ application- 
independent visual metaphors for SQL query specification |12[ [8] 
m Package queries are fundamentally harder to express and eval¬ 
uate compared to traditional SQL. Therefore, it is increasingly im¬ 
portant to provide visual paradigms to guide users through query 
building and through navigating and refining the results. PACK¬ 
AGEBUILDER helps users to interactively specify base and global 
constraints for their packages. The system interface also allows 
users to visually navigate through the solution space and to easily 
refine the resulting packages (Section|^. 

Evaluation: In traditional database queries, the size of the answer 
is polynomial in the size of the input data. This is not true for 
package queries: If n tuples satisfy the base constraints of a pack¬ 
age query, there are fl(2") candidate packages that can potentially 
satisfy the user’s global constraints. This makes the evaluation 
of package queries particularly challenging. With an exponential 
search space, efficiently searching for packages that satisfy the 
users’ constraints requires applying non-trivial pruning techniques 
and search heuristics (Section]^. 

We proceed to describe the main three aspects of our system that 
are motivated by these challenges. We conclude with a description of 
a demonstration scenario that is illustrative of the system (Section|^. 

2. PaQL: PACKAGE QUERY LANGUAGE 

Our PackageBuilder system extends traditional database func¬ 
tionality to provide full-fledged support for packages. We identify 
two important reasons to support packages at the database level. 
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Figure 1: The visual interface of PackageBuilder provides different visual representations of packages, and allows the user to 
interactively manipulate package queries. 


rather than at the application level: (a) The data used to construct 
packages typically resides in a database system, and packages them¬ 
selves are structured data objects that should naturally be stored in 
and manipulated by a database system, (h) The features of packages 
and the algorithms for constructing them are not unique to each 
application; therefore, the burden of package support should be 
shifted away from the application developers. 

We designed PaQL, a declarative SQL-based query language for 
specifying package queries. The following query builds the athlete’s 
daily meal plan described in Section[T] 


SELECT 

FROM 

WHERE 

SUCH THAT 

MAXIMIZE 


PACKAGE{R) AS P 

Recipes R 
R.gluten = ‘free’ 

COUNT(*) = 3 AND 

SUM(P.calories) BETWEEN 2000 AND 2500 
SUM(P.protein) 


The introduction of the keyword PACKAGE differentiates PaQL 
queries from traditional SQL queries. Semantically, PACKAGE 
constructs multisets from subsets of tuples from the base relations 
listed in the FROM clause. With no further constraints, there are 
infinitely many packages that can be built from non-empty base 
relationsH Users can limit the number of times a tuple from the input 

'This assumes that a tuple from an input relation can appear multiple 
times in the package result. 


relation R can appear in the package result by adding a REPEAT 
keyword in the FROM clause. For example, “FROM Recipes R 
REPEAT k” would allow a tuple to be repeated up to k times in a 
package. 

A package query defines two types of constraints. Base con¬ 
straints, defined in the WHERE clause, are equivalent to selection 
predicates and can be evaluated with standard SQL: any tuple in the 
package needs to individually satisfy all the base constraints. In the 
example query, the base constraint “R.gluten = ‘free’” specifies that 
each meal in the package should be gluten-free. Global constraints 
are defined in the SUCH THAT clause. They express higher-order 
predicates: tuples in a package need to collectively satisfy all global 
constraints. This means that a global constraint is a property of 
a package, not of a single tuple. For example, “COUNT(*) = 3” 
specifies that the entire package should have exactly 3 meals. PaQL 
also allows the expression of sub-queries in the SUCH THAT clause. 
In contrast with base constraints, global constraints cannot be ex¬ 
pressed by traditional SQL queries. 

The objective clause, MAXIMIZE (or MINIMIZE), is unique to 
packages as well: it specifies that out of all packages that satisfy 
the base and global constraints, the ones with larger value in the 
MAXIMIZE clause are preferable. A detailed description of PaQL 
can be found online (Tj. 

































3. INTERFACE ABSTRACTIONS 

Package queries are more complex, semantically and algorith¬ 
mically, compared to traditional database queries, and they pose 
challenges on several fronts: (a) they can have complex specifi¬ 
cations, (b) they produce a large number of results, which poses 
usability challenges, and (c) they are computationally intensive to 
evaluate. We discuss the third challenge in Section]^ In this section, 
we describe several interface abstractions that PackageBuilder 
implements to address the first two challenges. 

3.1 Specification 

Our package template abstraction encodes package specifications 
in a familiar tabular format (Figure [T] shows a screenshot exam¬ 
ple). The central component of the template is a sample package, 
presented as a scrollable table. Additional components include rep¬ 
resentations of base and global constraints, optimization objectives, 
and suggestions for additional package refinements. As a user in¬ 
teracts with the template by highlighting elements in the sample 
package, PackageBuilder suggests constraints m- For exam¬ 
ple, when the user selects a cell within the “fats” column, the system 
proposes several constraints that would restrict the amount of fat in 
each meal, and objectives that would minimize the total amount of 
fat. The package template is quite expressive but is not as powerful 
as the PaQL language itself. The abstraction tries to strike a balance 
between ease-of-use and expressive power. 

3.2 Presentation 

In addition, PACKAGEBUILDER presents packages in a way that 
allows users to meaningfully view the entire package space, without 
having to actually examine it in its entirety (see the visual summary 
at the bottom of Figure[TJ. The system analyzes the current query 
specification and selects two dimensions to visually layout the valid 
packages along. Users can use the visual summary to navigate 
through the available packages by selecting glyphs that represent 
them. 

3.3 Adaptive exploration 

Many users may prefer specifying queries in trial-and-error, in¬ 
cremental form, rather than providing a complete and precise speci¬ 
fication from the very beginning. To facilitate this approach, PACK¬ 
AGEBUILDER initially presents a sample package that satisfies a 
few basic constraints. Users can then select good tuples within the 
sample, and request a new sample that replaces the unselected tu¬ 
ples. Users can repeat this process until they reach the ideal package. 
PackageBuilder uses these selections to narrow the search space 
as well as to identify additional package constraints. 

4. EVALUATION 

Evaluating package queries is nontrivial: even if the package 
query does not allow duplicate tuples, the number of valid packages 
is in the worst case exponential in the number of base tuples. In fact, 
any subset of the base tuples may potentially be a valid package. 
A brute-force approach that generates and evaluates all candidate 
packages is thus impractical. 

PackageBuilder is an external module which communicates 
with the DBMS, where the data resides, via SQL. To evaluate a pack¬ 
age query, the system parses a PaQL query and performs a search to 
generate valid packages. The system either: (i) uses SQL statements 
to generate and validate candidate packages; or (ii) translates pack¬ 
age queries to constraint optimization problems, and employs state- 
of-the-art constraint solvers to derive valid packages. At the heart of 
the query evaluation system, PACKAGEBUILDER uses and extends 


the Tiresias query engine j^. Even though PACKAGEBUILDER uses 
the Tiresias query engine, it has several differences: 

• Package queries specify tuple collections (packages), whereas 
Tiresias’ how-to queries specify updates to underlying datasets. 

• PackageBuilder allows a tuple to appear multiple times in a 
package result; this does not map to any operation in Tiresias. 

• PaQL is SQL-based whereas Tiresias uses a variant of Datalog. 

• PackageBuilder supports arbitrary Boolean formulas in the 
SUCH THAT clause, whereas Tiresias only supports conjunctive 
how-to queries. 

• PackageBuilder employs additional heuristic and pruning 
techniques to increase the efficiency of package queries. 

We proceed to describe, at a high level, some of the extensions to 
Tiresias used in PACKAGEBUILDER to evaluate package queries. 

4.1 Cardinality-based pruning 

With pruning techniques, the system can avoid generating can¬ 
didate packages that cannot possibly satisfy some of the global 
constraints. Given a global constraint C, our pruning strategy iden¬ 
tifies a lower cardinality bound I and an upper cardinality bound 
u for any package that can satisfy C- For example, if C is de¬ 
fined as fl < COUNT(*) < b, the cardinality bounds are trivially 
I = a and u = b. As another example, consider the global con¬ 
straint on total calories per package: 2000 < SUM (calories) < 2500. 
In this case, the cardinality bounds are / = [ MAX^rabries) 1 

u = L pTN^UTiii) J ■ ^ ■'Snipes with MAX(calories) 

and at most u recipes with MIN(calories) we can achieve both the 
lower and upper bounds of the summation constraint. 

Assuming queries that do not allow repeated tuples, if n tuples 
satisfy the base constraints, this pruning approach reduces the search 

space from 2" to (") -f (,"i) H-h („"i) + (()), without losing any 

valid solution. 

4.2 Heuristic local search 

Pruning often reduces the search space significantly, but this re¬ 
duction alone is seldom sufficient. In addition to pruning algorithms, 
which reduce the search space while maintaining completeness, 
PackageBuilder employs a heuristic local search to hasten the 
computation. As with any heuristic, there is no guarantee that all 
valid solutions will be found. Given a starting package Pq (which 
can be constructed, for example, at random), PACKAGEBUILDER 
identifies all possible L-tuple replacements that can lead to a valid 
package, by using a single SQL query. For example, suppose we 
wish to generate meal packages with less than 2,500 total calories. 
Given a package Pq having a total of 3,000 calories, we can identify 
all possible single-tuple replacements which lead to valid packages 
with the following SQL query: 

SELECT Po-id, P.id 

FROM Po. Recipes P 

WHERE 3000 — Pq. calories + P.calories < 2500 

This query implements a greedy heuristic that is only able to 
locate valid packages that differ from Pq by one single tuple. It fails 
to find any valid package that differ from Pq by more than one tuple. 
The query can be also modified to explore packages of different 
cardinalities in a straightforward way. Notice that the query is a 
selection over a Cartesian product between the candidate package 
and the recipe relation. This approach is very efficient if we are 
attempting to replace only a few tuples at a time. For k replacements, 
however, this method would require a Ik-way join, which quickly 
becomes intractable. 




This local search is also particularly useful for adaptive explo¬ 
ration (Section [T3| l, where users usually request the replacement of 
only a few tuples at a time. 

5. CHALLENGES 

Package queries pose a series of new challenges on database 
query engines. We discuss here a few of the research directions that 
we plan to explore. 

Optimizing PaQL queries: Our experience with Package- 
Builder shows that each of the evaluation techniques we 
adopted have different strengths and weaknesses. Currently, 
PackageBuilder heuristically combines all of them to efficiently 
derive packages. However, a more principled approach to package 
query optimization could add several benefits to the query engine. 
Solver limitations: Constraint solvers are typically limited to re¬ 
turning a single package solution at a time, and retrieving more 
packages requires modifying and re-evaluating the query. Moreover, 
solvers cannot usually handle non-linear global constraints; hence 
evaluating such queries requires different methods. 

Diverse package results: The number of solutions to a package 
query can potentially be extremely large, making it harder for users 
to explore the package space and find interesting packages. We plan 
to devise techniques to present the user with the most diverse and 
potentially interesting packages, extracted from the space of valid 
or invalid solutions. 

6. RELATED WORK 

Package queries are instances of constraint satisfaction prob¬ 
lems (10| , a well-known class of NP-complete problems. Package 
queries can be used to provide set-based recommendations, such 
as those available in CourseRank [0. PaQL offers a more gen¬ 
eral framework for package recommendations. For instance, it can 
easily express pre-requisite constraints typical of course package 
recommendation systems. PackageBuilder extends the Tiresias 
query engine Q with several new features, which we discussed in 
SectionlD 

Package queries are also related to skyline queries |4|, top-fc 
queries Q and multi-objective queries j^, in their intent to let users 
filter a set of objects based on optimization objectives. However, 
PaQL queries differ from them for several reasons. First, the opti¬ 
mization objectives of a skyline query are per tuple, rather than per 
package. This makes it possible to express a skyline query with tra¬ 
ditional nested SQL Q, whereas the global constraints expressible 
in PaQL are not expressible in traditional SQL. Secondly, PACKAGE¬ 
BUILDER supports one single per-package optimization objective, 
as opposed to multiple per-tuple objectives supported by multi¬ 
objective and skyline queries, and does not support top-k queries. 
Finally, multi-objective queries (comprising skyline queries) return 
the set of non-dominated objects. PACKAGEBUILDER, instead, 
returns the optimal package for any given query. 

7. DEMO SCENARIO 

VLDB attendees visiting our booth will learn, test, and use PACK¬ 
AGEBUILDER. We will demo PACKAGEBUILDER on a real-world 
application: the meal planner. Meal planner has a rich recipe data 
set scrapped from online recipe and nutrition websites. Attendees 
will observe how packages are specified with the package template, 
and interactively refined with adaptive exploration. In addition, a 
quick tutorial will highlight the key features of PaQL and describe 
the query engine optimizations we employ to optimize the pack¬ 
age search process. For instance, we will show how a PaQL query 


is translated into a linear program and then solved using existing 
constraint solvers. To control booth crowds, we will provide video 
tutorials and online guides, and make them accessible through hand¬ 
held devices present at the booth. Attendees can choose to learn 
about PackageBuilder either by using these materials or by 
interacting with the presenters. 

Attendees will then test PACKAGEBUILDER by building their own 
recipe packages using either the visual interface or PaQL. Attendees 
can save their packages, as well as share their results through tweets 
or emails. The meal planner will be accessible online throughout 
the duration of the conference and users will be able to revise their 
saved packages at any time. 
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